We study the problem of optimizing the Shannon mutual information for sources of real quantum
states, i.e. sources for which there is a basis in which all the states have only real components. We
consider in detail the sources $\mathcal{E}_M$ of $M$ equiprobable qubit states lying symmetrically around the
great circle of real states on the Bloch sphere and give a variety of explicit optimal strategies.
We also consider general real group-covariant sources for which the group acts irreducibly on the
subset of all real states and prove the existence of a real group-covariant optimal strategy, extending
a theorem of Davies (E. B. Davies, IEEE Trans. Inf. Theory IT-24, 596 (1978)). Finally we propose
an optical scheme to implement our optimal strategies, simple enough to be realized with present
technology.

There are two principal measures of quality in the quantum detection problem for a given finite number of quantum
states with fixed prior probabilities. One is the minimization of a specified Bayes cost, and the other is the
maximization of the Shannon mutual information. The former is useful if one has to reach a decision after performing
a single quantum measurement, whereas the latter is more relevant for the problem of transmitting as much classical
information as possible using the given ensemble of states. In this paper we will consider the problem of maximizing
the Shannon mutual information for a certain class of quantum ensembles.

In a general communication setting, let $\{x_i \in X\}$ be input letters and let $\{\xi_i\}$ be their prior probabilities. Let us
denote output letters by $\{y_j \in Y\}$. Both the Bayes cost and the Shannon mutual information are defined in terms of
the conditional probability $P(j|i)$ of obtaining output $y_j$ provided that the letter sent was $x_i$. The former is defined as

$$B(X:Y) = \sum_{i,j} C_{ij}\,\xi_i\,P(j|i), \tag{1}$$

for a Bayes cost matrix $[C_{ij}]$, while the latter is defined as

$$I(X:Y) = \sum_i \xi_i \sum_j P(j|i)\,\log\frac{P(j|i)}{\sum_k \xi_k P(j|k)}. \tag{2}$$

(Since all the results in this paper are valid for any logarithm base, we shall specify the base only where necessary.) 
In classical information theory, the channel matrix is given and fixed, characterising the noise in the channel. 

In contrast, in a quantum information theoretic context where signal carriers are to be quantum states transmitted 
without noise, the channel matrix generally becomes a variable. This is because the act of quantum detection itself 
generally has a probabilistic output so the channel matrix is dependent on the choice of quantum detection strategy. 
More precisely, the input letters correspond to a set of positive trace class operators of trace one $\{\rho_i\}$ on a Hilbert
space $\mathcal{H}_s$. A quantum detection strategy is described by a positive operator-valued measure (POVM) on $\mathcal{H}_s$. A
POVM is any set $\{\pi_j\}$ of hermitian positive operators forming a resolution of the identity:

$$\pi_j^\dagger = \pi_j, \qquad \pi_j \ge 0 \;\; \forall j, \qquad \sum_j \pi_j = I. \tag{3}$$

The detection operator $\pi_j$ corresponds to the output letter $y_j$ and the conditional probabilities are given by

$$P(j|i) = \mathrm{Tr}(\pi_j \rho_i).$$

Thus in the quantum context the optimization of $I(X:Y)$ is carried out with respect to the choice of POVM $\{\pi_j\}$
for fixed ensemble $\mathcal{E} = \{\rho_i; \xi_i\}$ (i.e. with fixed letter states $\rho_i$ and fixed prior probabilities $\xi_i$). The maximum value
of $I(X:Y)$ is called the accessible information of the ensemble $\mathcal{E}$.
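
As a minimal numerical sketch of Eq. (2), the mutual information can be evaluated directly from the priors and the channel matrix $P(j|i)$. The function name `mutual_information` and its argument layout are illustrative assumptions, not notation from the paper.

```python
import math

def mutual_information(priors, cond, base=math.e):
    """Shannon mutual information I(X:Y) of Eq. (2).

    priors[i]  : prior probability xi_i of input letter i
    cond[i][j] : conditional probability P(j|i) of output j given input i
    base       : logarithm base (e gives nats, 2 gives bits)
    """
    n_out = len(cond[0])
    # Output distribution P(j) = sum_k xi_k P(j|k)
    p_out = [sum(priors[k] * cond[k][j] for k in range(len(priors)))
             for j in range(n_out)]
    total = 0.0
    for i, xi in enumerate(priors):
        for j in range(n_out):
            p = cond[i][j]
            if p > 0.0:  # the term 0 log 0 is taken to be 0
                total += xi * p * math.log(p / p_out[j], base)
    return total
```

In the quantum setting the entries `cond[i][j]` would be filled with $\mathrm{Tr}(\pi_j \rho_i)$ for a chosen POVM, so the accessible information is the maximum of this function over all POVMs.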

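The set $\mathcal{P}$ of all POVMs is a convex set and $I(X:Y)$ enjoys the following fundamental property:
(CONV): For a fixed ensemble $\mathcal{E} = \{\rho_i; \xi_i\}$, $I(X:Y)$ is a convex function on $\mathcal{P}$.

A proof of (CONV) is given in theorem 2.7.4 of [?]. Let $I(\mathcal{E} : \mathcal{A})$ denote the mutual information obtained from the
POVM $\mathcal{A}$ applied to the ensemble $\mathcal{E}$. Then if $\mathcal{A}$ is a convex combination of POVMs $\mathcal{A}_i$,

$$\mathcal{A} = p_1\mathcal{A}_1 + \cdots + p_n\mathcal{A}_n,$$

it follows from (CONV) that

$$I(\mathcal{E} : \mathcal{A}) \le \sum_i p_i\, I(\mathcal{E} : \mathcal{A}_i) \le \max_i I(\mathcal{E} : \mathcal{A}_i). \tag{4}$$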

The Bayes cost $B(X:Y)$ is an affine concave function on the convex set $\mathcal{P}$ of all POVMs. Therefore the Bayes cost minimization
problem is a kind of linear programming problem and is expected to have a unique solution. A necessary and sufficient
condition for specifying the optimum solution is known [?]. On the other hand, the Shannon mutual information
$I(X:Y)$ is a nonlinear and convex function on $\mathcal{P}$. The maximization of this quantity is a much harder problem and
only a necessary condition for the optimum is known. Thus the maximization of $I(X:Y)$ with respect to the
detection strategy $\{\pi_j\}$ is a basic and open problem in quantum information theory.

In this problem, the number of outputs is not necessarily the same as the number of the inputs. The optimum
solution is not necessarily unique either. However it is known that there must be at least one optimum solution which
corresponds to an extreme point of the convex set $\mathcal{P}$ of all POVMs. This is due to the convexity of the function $I(X:Y)$. Such an
extreme point is a set of rank one elements, which means that each $\pi_j$ has the form $\kappa|v\rangle\langle v|$ where $|v\rangle$ is a pure state
and $0 < \kappa \le 1$. The number of elements, $N$, can be bounded by $d \le N \le d^2$ where $d$ is the dimension of the Hilbert
space $\mathcal{H}_s$ supporting the input state ensemble $\{\rho_i\}$ [5]. $I(X:Y)$ is also possibly maximized at some interior
points of $\mathcal{P}$. In that case the number of outcomes may exceed $d^2$. Explicit examples of optimal solutions have
been given for binary ensembles and for the ensemble of four qubit states with tetrahedral symmetry [?]. The
latter is a specific example of a general result of Davies [5] characterising the form of an optimal strategy for any
symmetrical ensemble whose symmetry group acts irreducibly on the whole state space.

In this paper we will study the accessible information and corresponding optimal strategies for an ensemble $\mathcal{E}_M$ of
$M$ qubit states with symmetry group $Z_M$, the group of integers modulo $M$. Some of our results will also apply to
more general ensembles. $\mathcal{E}_M$ may be explicitly described as follows. Let $\{|0\rangle, |1\rangle\}$ be the z-spin eigenstates and
write $|\psi_0\rangle = \binom{1}{0}$. Let

$$V = \exp\left(-i\frac{\pi}{M}\sigma_y\right) = \begin{pmatrix} \cos\frac{\pi}{M} & -\sin\frac{\pi}{M} \\ \sin\frac{\pi}{M} & \cos\frac{\pi}{M} \end{pmatrix}. \tag{5}$$

Then $\mathcal{E}_M$ consists of the $M$ states

$$|\psi_k\rangle = V^k|\psi_0\rangle = \begin{pmatrix} \cos\frac{k\pi}{M} \\ \sin\frac{k\pi}{M} \end{pmatrix}, \qquad k = 0, \ldots, M-1, \tag{6}$$

taken with equal prior probabilities $\xi_k = 1/M$. Note that these states (in the z-spin basis) involve only real components.
On the Bloch sphere they are equally spaced around a great circle $C$ in the x-z plane consisting of all real states.
The antipodal points which have $C$ as equator are the two $\sigma_y$ eigenstates. Thus $\mathcal{E}_M$ is clearly symmetrical with
respect to the group $Z_M$ whose generator is represented by a $2\pi/M$ rotation about the axis joining the $\sigma_y$ eigenstates. At
the Hilbert space level the operators $V^k$ in Eq. (6) provide a projective unitary representation of $Z_M$ (e.g. $V^M = -I$
and c.f. Eq. (7) later).
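
The construction of the ensemble can be sketched numerically as follows. This is only an illustration of Eqs. (5) and (6) in the real z-spin basis; the helper names `rotation`, `apply` and `ensemble_states` are assumptions introduced here.

```python
import math

def rotation(angle):
    """Real matrix of exp(-i * angle * sigma_y) in the z-spin basis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return (matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1])

def ensemble_states(M):
    """Return the states V^k |psi_0>, k = 0..M-1, and also V^M |psi_0>."""
    V = rotation(math.pi / M)
    states, psi = [], (1.0, 0.0)
    for _ in range(M):
        states.append(psi)
        psi = apply(V, psi)   # advance from V^k |psi_0> to V^(k+1) |psi_0>
    return states, psi
```

The second return value comes back as $-|\psi_0\rangle$ up to rounding, which is the sign $V^M = -I$ that makes the representation projective rather than ordinary.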
This symmetry group does not act irreducibly on the whole state space. Indeed the $\sigma_y$ eigenstates are left invariant
by the group action. (Irreducibility on the whole state space requires that the only invariant point is the maximally
mixed state.) Hence we cannot apply Davies' theorem [5] to provide an optimal strategy for $\mathcal{E}_M$. Nevertheless we
will prove that the conclusion of Davies' theorem remains true in this case, i.e. that there exists a pure state $|a_0\rangle$ such
that the $Z_M$-symmetric POVM

$$\mathcal{A}_M = \left\{ \frac{2}{M}|a_k\rangle\langle a_k| : k = 0, \ldots, M-1 \right\} \quad \text{where } |a_k\rangle = V^k|a_0\rangle,$$

is an optimal strategy for $\mathcal{E}_M$. Furthermore we will show that $|a_0\rangle$ may be taken to be the state orthogonal to $|\psi_0\rangle$.

The case $M = 3$ is of particular interest. It is the so-called trine ensemble which has been much studied [9], [10], [11].
Holevo in 1973 [?] showed that no von Neumann measurement in $\mathcal{H}_2$ can be an optimal strategy, demonstrating the
necessity of considering more general POVMs in quantum detection theory. Since that time it has been conjectured
that the strategy $\mathcal{A}_3$ above is optimal for the trine source. Our results resolve this conjecture affirmatively.

The strategy $\mathcal{A}_M$ has $M$ elements. However, as noted above, for ensembles in $d = 2$ dimensions there is always an
optimal strategy with at most $d^2 = 4$ elements (which does not increase with $M$). We will show that the ensembles
$\mathcal{E}_M$ always have an optimal strategy with at most 3 elements and explicit strategies of this form will be described for
all $M$. If $M$ is even then $\mathcal{E}_M$ consists of $M/2$ pairs of orthogonal states. Let $\{|\xi\rangle, |\eta\rangle\}$ be any one of these pairs. We
will show that the two-element POVM $\{|\xi\rangle\langle\xi|, |\eta\rangle\langle\eta|\}$ (a regular von Neumann measurement) is always an optimal
strategy when $M$ is even. We will also describe further optimal $K$-element POVMs where $K$ lies between 3 and $M$.



II. A GROUP-THEORETIC APPROACH 



We begin by setting up a group-theoretic formalism for symmetric ensembles, leading to a main result (theorem
1) which applies to symmetric ensembles of real states in any dimension $d \ge 2$. An essential requirement in many of
our results will be that various states and unitary operators be real. The requirement that a state or operator be real
has, of course, no intrinsic physical meaning. When we speak of real states and real operators we will always mean
simply that there exists a basis of the Hilbert space relative to which all the required objects simultaneously have real 
components or real matrix elements. 

A projective unitary representation of a group $G$ is an assignment of a unitary operation $U_g$ to each member of $G$
satisfying

$$U_{g_1} U_{g_2} = e^{i\phi(g_1, g_2)}\, U_{g_1 g_2}, \tag{7}$$

where the phases $\phi(g_1, g_2)$ may be chosen arbitrarily. A finite ensemble $\mathcal{E}$ of equiprobable (generally mixed) states is
said to be symmetric with respect to the group $G$, or G-covariant, if the following condition is satisfied: there is a
projective unitary representation $\{U_g\}$ of $G$ such that for all $g$, $U_g \rho U_g^\dagger$ is in $\mathcal{E}$ whenever $\rho$ is in $\mathcal{E}$. We write

$$g\rho = U_g \rho U_g^\dagger \tag{8}$$

for the action of $g$ on the state $\rho$. The phases $\phi(g_1, g_2)$ do not appear in Eq. (8) and $g_1(g_2(\rho)) = (g_1 g_2)(\rho)$. Note
that, in contrast to Davies [5], we do not require that $G$ parameterises $\mathcal{E}$, i.e. $G$ need not act transitively on the set of
states of $\mathcal{E}$. For example, $\mathcal{E}_M$ is $Z_M$-covariant and the action is transitive, but $\mathcal{E}_{2N}$ is also $Z_2$- and $Z_N$-covariant via
non-transitive actions.

A G-covariant POVM $\mathcal{A}$ (for the projective unitary representation $\{U_g\}$) is a POVM such that $U_g A U_g^\dagger$ is in $\mathcal{A}$
whenever $A$ is in $\mathcal{A}$. We write

$$gA = U_g A U_g^\dagger \tag{9}$$

for the action of $g$ on a POVM element $A$. From Eqs. (8) and (9) we see that $\mathrm{Tr}(A\rho) = \mathrm{Tr}(gA \cdot g\rho)$, i.e. the
probability of outcome $A$ on state $\rho$ is G-invariant. Hence

$$\mathrm{Tr}(gA \cdot \rho) = \mathrm{Tr}(A \cdot g^{-1}\rho), \tag{10}$$

so that the set of probabilities of the G-shifted outputs $gA$ on a fixed input $\rho$ are obtained as a permutation of the
set of probabilities of the unshifted output $A$ acting on suitably shifted inputs.
Let $\mathcal{E}$ be a G-covariant ensemble with projective unitary representation $\{U_g\}$. We aim to find conditions on $\{U_g\}$
which will guarantee the existence of a G-covariant POVM $\mathcal{A} = \{A_g : g \in G\}$ with elements parameterised by $G$, and
having group action $gA_h = A_{gh}$. Thus if $e$ is the identity of $G$ we have

$$A_g = U_g A_e U_g^\dagger \tag{11}$$

and we require

$$M = \sum_{g \in G} A_g = I. \tag{12}$$

(Later we will take the elements of $\mathcal{A}$ to be rank 1 and consider the question of when $\mathcal{A}$ is an optimal strategy for $\mathcal{E}$.)
From Eq. (11) we see that $M$ commutes with all the $U_g$'s:

$$U_g M = M U_g. \tag{13}$$

Thus if the set $\{U_g\}$ acts irreducibly on the state space (i.e. there is no proper invariant subspace) then Schur's
lemma will guarantee that Eq. (12) holds. This fact is used by Davies [5] to characterise an optimal strategy for any
G-covariant ensemble whose symmetry group acts irreducibly on the whole state space. However this condition of full
irreducibility on the whole state space is not necessary for Eq. (12) to hold. We will use the following more general
form of Schur's lemma:

Lemma 1: Let $\{M_g\}$ be any set of non-singular $d$ by $d$ matrices over some field $F$ which acts irreducibly on the
vector space $V = F^d$ (i.e. there is no proper subspace mapped to itself by all the $M_g$'s). Suppose that $K$ is any
matrix that commutes with all the $M_g$'s:

$$K M_g = M_g K. \tag{14}$$

Then:

(a) either $K = 0$ or $K$ is non-singular,

(b) if $K$ has a non-zero eigenvalue $\lambda$ in $F$, then $K = \lambda I$.

Proof: (a) Let $K(V)$ denote the image of $V$ under the map $K$ and similarly for $M_g(V)$. Since $M_g$ is non-singular
we have $M_g(V) = V$. By Eq. (14) we have $M_g K(V) = K M_g(V) = K(V)$, i.e. $K(V)$ is an invariant subspace. Hence
either $K(V) = \{0\}$ (in which case $K = 0$) or else $K(V) = V$ (in which case $K$ is non-singular).

(b) If $K$ has eigenvalue $\lambda$ in $F$ then $B = K - \lambda I$ is singular. Also $B M_g = M_g B$ for all $g$. Hence by (a), $B$ must be
zero, i.e. $K = \lambda I$. ■

We will apply this lemma with $F = \mathbb{R}$ to obtain useful results about G-covariant ensembles of real states whose
group $G$ acts irreducibly only on the restricted set $\mathbb{R}^d$ of real states (but not necessarily irreducibly on the full state
space). This is the case for our ensembles $\mathcal{E}_M$. Let $|G|$ denote the size of $G$ and let $d = \mathrm{Tr}\, I$ be the dimension of the
Hilbert space.

Lemma 2: Suppose that $\{U_g\}$ is a projective unitary representation of $G$ such that the $U_g$ are all real matrices and
$\{U_g\}$ acts irreducibly on $\mathbb{R}^d$. Let $|v\rangle \in \mathbb{R}^d$ be any real state. Write

$$A_g = \frac{d}{|G|}\, U_g |v\rangle\langle v| U_g^\dagger.$$

Then $\{A_g : g \in G\}$ is a G-covariant POVM, i.e. $\sum_{g \in G} A_g = I$.

Proof: Let $M = \sum_{g \in G} A_g$. Then $M$ is a real matrix and $M U_g = U_g M$ for all $g \in G$. Also $M$ is a hermitian
positive matrix (being a sum of projectors with positive coefficients) so it has a real positive eigenvalue $\lambda > 0$. By
the previous lemma, $M = \lambda I$. Since $\mathrm{Tr}\, A_g = d/|G|$ for all $g$, we get $\mathrm{Tr}\, M = d = \mathrm{Tr}\, I$ so $\lambda = 1$. ■
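
Lemma 2 can be checked numerically for the qubit case, assuming $G = Z_M$ acts on real vectors by rotations through $g\pi/M$ (so $d = 2$ and $|G| = M$). The sketch below is an illustration only; `covariant_povm` and `operator_sum` are hypothetical helper names.

```python
import math

def covariant_povm(M, v):
    """Elements (d/|G|) U_g |v><v| U_g^T for G = Z_M and d = 2."""
    norm = math.hypot(v[0], v[1])
    x, y = v[0] / norm, v[1] / norm            # normalise the seed state |v>
    weight = 2.0 / M
    elements = []
    for g in range(M):
        c, s = math.cos(g * math.pi / M), math.sin(g * math.pi / M)
        wx, wy = c * x - s * y, s * x + c * y   # U_g |v>, a rotation by g*pi/M
        elements.append([[weight * wx * wx, weight * wx * wy],
                         [weight * wy * wx, weight * wy * wy]])
    return elements

def operator_sum(elements):
    """Entrywise sum of a list of 2x2 matrices."""
    return [[sum(e[r][c] for e in elements) for c in range(2)]
            for r in range(2)]
```

For any real seed vector and any $M \ge 2$ the sum returned by `operator_sum` agrees with the identity up to rounding, as the lemma asserts.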

Theorem 1: Let $\mathcal{E}$ be any ensemble of equiprobable real states in dimension $d$. Suppose that $\mathcal{E}$ is G-covariant with
respect to a projective unitary representation $\{U_g\}$ of real matrices which acts irreducibly on $\mathbb{R}^d$. Then there exists
a real pure state $|v\rangle$ such that the G-covariant POVM $\mathcal{D} = \{D_g : g \in G\}$ defined by

$$D_g = \frac{d}{|G|}\, U_g |v\rangle\langle v| U_g^\dagger$$

is an optimal strategy for $\mathcal{E}$.

Proof: We will work in the basis with respect to which the states of $\mathcal{E}$ and the matrices $U_g$ have real entries. Let $\mathcal{A} =
\{A_1, \ldots, A_n\}$ be any optimal POVM for $\mathcal{E}$. We will transmogrify $\mathcal{A}$ into the required form while preserving optimality.
First strip off all imaginary parts of the entries of the matrices $A_k$. Let $\tilde{A}_k = \mathrm{Re}(A_k)$ and $\tilde{\mathcal{A}} = \{\tilde{A}_1, \ldots, \tilde{A}_n\}$. Then
$\tilde{\mathcal{A}}$ is again a POVM and has real symmetric matrices as elements. (To see that $\tilde{A}_k$ is a positive matrix note that $A_k$
positive implies that the complex conjugate $A_k^*$ is positive so $\tilde{A}_k = \frac{1}{2}(A_k + A_k^*)$ must be positive. Also $\sum_k A_k = I$ and
$I$ is real so $\sum_k \tilde{A}_k = I$ too.) Next note that $\mathrm{Tr}\, \tilde{A}_k\rho = \mathrm{Tr}\, A_k\rho$ for any real state $\rho$ (since $\mathrm{Im}(A_k)$ is antisymmetric) so
$\tilde{\mathcal{A}}$ remains an optimal strategy.

In general $\tilde{\mathcal{A}}$ will not have rank 1 elements even if $\mathcal{A}$ had rank 1 elements. Thus decompose each $\tilde{A}_k$ into its rank 1
eigenprojectors (multiplied by the corresponding eigenvalues) which are necessarily real as the eigenvalues/vectors of
any real symmetric matrix are real. Then form the larger POVM $\mathcal{B} = \{B_1, \ldots, B_m\}$ comprising all the scaled rank 1
eigenprojectors above. Such a refinement of a POVM can never decrease the mutual information so $\mathcal{B}$, with real rank
1 elements, is still optimal.

Now look at

$$C_{kg} = \frac{1}{|G|}\, gB_k \qquad \text{for } g \in G \text{ and } k = 1, \ldots, m. \tag{15}$$

Note that $\sum_{k,g} C_{kg} = I$ since $\sum_k B_k = I$ and $gI = I$ for all $g$. Let $\mathcal{C} = \{C_{kg}\}$ be the corresponding POVM with $|G|m$
elements. Thus $\mathcal{C}$ is G-covariant but the action of $G$ is not transitive. We finally aim to cut down $\mathcal{C}$ to a smaller
optimal G-covariant POVM with elements labeled by $G$.

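Let $I(\mathcal{E} : \mathcal{A})$ denote the mutual information obtained from any POVM $\mathcal{A}$ applied to any ensemble $\mathcal{E}$. First we
show that $I(\mathcal{E} : \mathcal{C}) = I(\mathcal{E} : \mathcal{B})$ so that $\mathcal{C}$ remains optimal. Let us label the inputs by $i \in \mathcal{I}$ and denote conditional
probabilities for $\mathcal{C}$ by $P(kg|i)$. Denote the conditional probabilities for $\mathcal{B}$ by $P_B(k|i)$ and let $\xi$ be the constant prior
input probability. Then

$$P(kg|i) = \mathrm{Tr}\, C_{kg}\rho_i = \frac{1}{|G|}\,\mathrm{Tr}\,(gB_k \cdot \rho_i).$$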

According to Eq. (10), for each fixed $g$ and $k$ the resulting set of probabilities labelled by $i \in \mathcal{I}$ will be just a
permutation of the set $P_B(k|i)$, rescaled by $1/|G|$. Thus

$$\sum_i \xi\, P(kg|i) \log P(kg|i)$$

will be independent of $g$ and also

$$\sum_i \xi\, P(kg|i)$$

will be independent of $g$. The mutual informations $I(\mathcal{E} : \mathcal{C})$ and $I(\mathcal{E} : \mathcal{B})$ are given by (c.f. Eq. (2)):

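$$I(\mathcal{E} : \mathcal{C}) = \sum_i \sum_{k,g} \xi\, P(kg|i)\, \log\frac{P(kg|i)}{\sum_{i'} \xi\, P(kg|i')},$$

$$I(\mathcal{E} : \mathcal{B}) = \sum_i \sum_k \xi\, P_B(k|i)\, \log\frac{P_B(k|i)}{\sum_{i'} \xi\, P_B(k|i')}.$$

On substituting the above G-invariant expressions into $I(\mathcal{E} : \mathcal{C})$ we readily get $I(\mathcal{E} : \mathcal{C}) = I(\mathcal{E} : \mathcal{B})$. (Our argument is
actually an explicit example of the claim in lemma 5 of [?].) Hence $\mathcal{C}$ remains optimal.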
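Finally note that for each $i$, $B_i/(\mathrm{Tr}\, B_i)$ is a real pure state so by lemma 2,

$$\mathcal{D}_i = \left\{ \frac{d}{|G|}\, g\frac{B_i}{\mathrm{Tr}\, B_i} : g \in G \right\}$$

is a POVM for each $i$. Now $\frac{\mathrm{Tr}\, B_i}{d}\mathcal{D}_i = \{\frac{1}{|G|}\, gB_i : g \in G\}$ so $\mathcal{C}$ is a convex combination

$$\mathcal{C} = \sum_{i=1}^{m} \frac{\mathrm{Tr}\, B_i}{d}\, \mathcal{D}_i.$$

Hence by Eq. (4)

$$I(\mathcal{E} : \mathcal{C}) \le \max_i I(\mathcal{E} : \mathcal{D}_i).$$

Since $\mathcal{C}$ was optimal it follows that at least one of the $\mathcal{D}_i$'s is optimal. This gives an optimal G-covariant POVM with
real rank 1 elements, parameterised by $G$, completing the proof. ■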



III. OPTIMAL STRATEGIES FOR $\mathcal{E}_M$



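We now return to the $Z_M$-covariant ensemble $\mathcal{E}_M$ in 2 dimensions, comprising the states

$$|\psi_k\rangle = \begin{pmatrix} \cos\frac{k\pi}{M} \\ \sin\frac{k\pi}{M} \end{pmatrix}, \qquad k = 0, \ldots, M-1,$$

with equal prior probabilities $1/M$. According to theorem 1, there must exist an optimal $Z_M$-covariant POVM $\mathcal{A} =
\{A_0, \ldots, A_{M-1}\}$ with $M$ real rank 1 elements. The elements will have the form $A_j = |a_j\rangle\langle a_j|$ with

$$|a_j\rangle = V^j|a_0\rangle = \sqrt{\frac{2}{M}} \begin{pmatrix} \cos(\theta + \frac{j\pi}{M}) \\ \sin(\theta + \frac{j\pi}{M}) \end{pmatrix}, \qquad j = 0, \ldots, M-1, \tag{16}$$

and $V$ is given in Eq. (5). The conditional probabilities $p(j|k) = |\langle a_j|\psi_k\rangle|^2$ may be readily computed and after some
rearrangement we obtain the mutual information $I(\theta)$ explicitly as

$$I(\theta) = \frac{1}{M} \sum_{k=0}^{M-1} \left(1 + \cos\left(2\theta - \frac{2k\pi}{M}\right)\right) \log\left(1 + \cos\left(2\theta - \frac{2k\pi}{M}\right)\right). \tag{17}$$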

In this section, the base of the logarithm is taken as $e$. (For this base the numerical value of Eq. (17) is the amount
of information in nats rather than bits.) From the symmetry, $I(\theta)$ is a periodic function with period $\pi/M$. Figure 1
shows numerical plots of $I(\theta)$ for $M = 2$, 3, 4 and 5 and illustrates the following basic property:

Lemma 3: For each $M$, $I(\theta)$ has a global maximum at $\theta = \pi/2$.

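Proof: Since $|\cos(2\theta - \frac{2k\pi}{M})| \le 1$, $I(\theta)$ can be expanded by using the formula

$$(1 + x)\ln(1 + x) = x + \sum_{n=2}^{\infty} \frac{(-1)^n}{n(n-1)}\, x^n, \qquad |x| \le 1. \tag{18}$$

We get

$$I(\theta) = \frac{1}{M} \sum_{k=0}^{M-1} \left[ \cos\left(2\theta - \frac{2k\pi}{M}\right) + \sum_{n=2}^{\infty} \frac{(-1)^n}{n(n-1)} \cos^n\left(2\theta - \frac{2k\pi}{M}\right) \right]
= \frac{1}{M} \sum_{n=2}^{\infty} \frac{(-1)^n}{n(n-1)} \sum_{k=0}^{M-1} \cos^n\left(2\theta - \frac{2k\pi}{M}\right),$$

since $\sum_{k=0}^{M-1} \cos(2\theta - \frac{2k\pi}{M}) = 0$. Next we separate out the even and odd parts of the series and replace powers of
cosines by multiple angle cosines to get:

$$I(\theta) = \frac{1}{M} \sum_{n=1}^{\infty} \frac{(-1)^{2n}}{2n(2n-1)} \sum_{k=0}^{M-1} \cos^{2n}\left(2\theta - \frac{2k\pi}{M}\right)
+ \frac{1}{M} \sum_{n=1}^{\infty} \frac{(-1)^{2n+1}}{(2n+1)2n} \sum_{k=0}^{M-1} \cos^{2n+1}\left(2\theta - \frac{2k\pi}{M}\right)$$

$$= \frac{1}{M} \sum_{n=1}^{\infty} \frac{(-1)^{2n}}{2n(2n-1)} \frac{1}{2^{2n-1}} \sum_{k=0}^{M-1} \left[ \frac{1}{2}\binom{2n}{n} + \sum_{l=0}^{n-1} \binom{2n}{l} \cos\left((2n-2l)\left(2\theta - \frac{2k\pi}{M}\right)\right) \right]$$

$$\quad + \frac{1}{M} \sum_{n=1}^{\infty} \frac{(-1)^{2n+1}}{(2n+1)2n} \frac{1}{2^{2n}} \sum_{k=0}^{M-1} \sum_{l=0}^{n} \binom{2n+1}{l} \cos\left((2n+1-2l)\left(2\theta - \frac{2k\pi}{M}\right)\right). \tag{19}$$

Then recall that

$$\sum_{k=0}^{M-1} \cos\left(L\frac{2k\pi}{M}\right) = \begin{cases} M & \text{for } L/M = q \text{ (integer)}, \\ 0 & \text{for } L/M \ne \text{integer}. \end{cases} \tag{20}$$

Applying this to Eq. (19) with $L = 2n - 2l$ and $L = 2n + 1 - 2l$ in the even and odd series, we get:

$$I(\theta) = \sum_{n=1}^{\infty} \frac{1}{2n(2n-1)2^{2n-1}} \left[ \frac{1}{2}\binom{2n}{n} + \sum_{l=0}^{n-1} \sum_{q=0}^{\infty} \binom{2n}{l} \cos(2\theta qM)\, \delta_{2n-2l,\,qM} \right]$$

$$\quad - \sum_{n=1}^{\infty} \frac{1}{(2n+1)2n\,2^{2n}} \left[ \sum_{l=0}^{n} \sum_{q=0}^{\infty} \binom{2n+1}{l} \cos(2\theta qM)\, \delta_{2n+1-2l,\,qM} \right]$$

$$= \sum_{n=1}^{\infty} \frac{1}{2n(2n-1)2^{2n}} \binom{2n}{n} + \sum_{q=0}^{\infty} f(qM)\,(-1)^{qM} \cos(2\theta qM), \tag{21}$$

where

$$f(qM) = \sum_{n=1}^{\infty} \sum_{l=0}^{n-1} \frac{\binom{2l+qM}{l}}{(2l+qM)(2l+qM-1)\,2^{2l+qM-1}} \left( \delta_{2n-2l,\,qM} + \delta_{2n+1-2l,\,qM} \right). \tag{22}$$

Since $f(qM) \ge 0$, $I(\theta)$ is maximized when $(-1)^{qM} \cos(2\theta qM) = 1$, that is, $\theta = \pi/2$ for all $M$.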
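
Lemma 3 can also be probed numerically by evaluating Eq. (17) on a grid. The sketch below is only a sanity check, not a substitute for the proof; the names `info` and `grid_max` and the grid resolution are assumptions made here.

```python
import math

def info(theta, M):
    """I(theta) of Eq. (17) in nats; terms with 1 + cos = 0 contribute 0."""
    total = 0.0
    for k in range(M):
        x = 1.0 + math.cos(2.0 * theta - 2.0 * math.pi * k / M)
        if x > 0.0:
            total += x * math.log(x)
    return total / M

def grid_max(M, steps=3600):
    """Largest value of I(theta) on a uniform grid over [0, pi)."""
    return max(info(math.pi * s / steps, M) for s in range(steps))
```

For the trine ensemble, `info(math.pi / 2, 3)` evaluates to $\ln(3/2) \approx 0.405$ nats, and for each small $M$ tried the grid maximum does not exceed the value at $\theta = \pi/2$.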




FIG. 1. The Shannon mutual information $I(\theta)$ in nats versus the optimization parameter $\theta$ for $M = 2$, 3, 4 and 5.

Hence in general an optimal strategy for $\mathcal{E}_M$ consists of choosing a real rank 1 POVM with elements $A_k$ lying in
directions orthogonal to the input states $|\psi_k\rangle$. This POVM will be denoted by $\mathcal{A}_M$. The output $A_k$ signifies with
certainty that the input was not $|\psi_k\rangle$ but leaves a residual uncertainty in the remaining signal states.

For a given ensemble $\mathcal{E}$ the optimal strategy is not unique and in practice it may be of interest to find optimal
POVMs with the minimum number of elements. The G-covariant optimal POVM above has $M$ elements and we
note here some ways of reducing this number using the group theoretic approach. In the next section, by different
methods, we will show that 3 elements always suffice for any real qubit source, and develop corresponding strategies
for the $\mathcal{E}_M$'s.
Lemma 4: Suppose that $k \ne 1$ divides $M$ exactly. Then there is a $Z_k$-covariant optimal POVM for $\mathcal{E}_M$ with $k$ real
rank 1 elements.

Proof: Since $k$ divides $M$, $Z_M$ has a subgroup isomorphic to $Z_k$ and so $\mathcal{E}_M$ is $Z_k$-covariant. Since $k \ne 1$, the action
of $Z_k$ contains a non-trivial rotation so it acts irreducibly on $\mathbb{R}^2$. Thus theorem 1 immediately gives the required
result. ■

Remark: Lemma 4 may also be obtained by a convexity argument as follows. We will illustrate the idea with the
specific example of $M = 15$ and $k = 3$. The general case is a straightforward generalisation. $Z_{15} = \{0, 1, \ldots, 14\}$ has
the subgroup $\{0, 5, 10\}$ isomorphic to $Z_3$. Let $\mathcal{A}_{15} = \{A_0, A_1, \ldots, A_{14}\}$ be the optimal strategy given by theorem 1
and lemma 3, with the direction of $A_k$ being orthogonal to the $k$-th state of $\mathcal{E}_{15}$. According to lemma 2, the three
directions 0, 5, 10 corresponding to the subgroup may be used to define a POVM. We just need to rescale $A_0$, $A_5$ and
$A_{10}$ so that they add up to $I$. The scaling factor is $15/3 = 5$. Thus $\mathcal{B}_0 = \{5A_0, 5A_5, 5A_{10}\}$ is a POVM. Now $I$ is always
G invariant so we can apply the group elements $l = 1, 2, 3$ and 4 of $Z_{15}$ to $\mathcal{B}_0$ to obtain POVMs

$$\mathcal{B}_l = l\mathcal{B}_0 = \{5A_l, 5A_{l+5}, 5A_{l+10}\} \qquad \text{for } l = 0, 1, 2, 3, 4.$$

Note that the $\mathcal{B}_l$'s have elements parameterised by the cosets of $Z_3$ in $Z_{15}$. Also by symmetry of the construction,
$I(\mathcal{E}_{15} : \mathcal{B}_l)$ is independent of $l$. Furthermore $\mathcal{A}_{15}$ is a uniform convex combination of the $\mathcal{B}_l$'s

$$\mathcal{A}_{15} = \sum_{l=0}^{4} \frac{1}{5}\, \mathcal{B}_l$$

so by Eq. (4):

$$I(\mathcal{E}_{15} : \mathcal{A}_{15}) \le \max_l I(\mathcal{E}_{15} : \mathcal{B}_l).$$

Since $\mathcal{A}_{15}$ was optimal we see that $\mathcal{B}_l$ is optimal for each $l$. This gives the result of lemma 4 and also identifies the
directions of the $k$ element POVM as being any chosen symmetrical set of $k$ directions orthogonal to corresponding
states of $\mathcal{E}_M$. ■
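
The coset construction in the remark can be sketched numerically. The code below builds the vectors of the rescaled sub-POVMs and sums the corresponding operators; `covariant_vectors`, `coset_povm` and `operator_sum` are hypothetical names, and the covariant elements are assumed to point orthogonally to the input states as in lemma 3.

```python
import math

def covariant_vectors(M):
    """Vectors |a_k> with |a_k><a_k| = (2/M) x projector orthogonal to |psi_k>."""
    w = math.sqrt(2.0 / M)
    return [(-w * math.sin(k * math.pi / M), w * math.cos(k * math.pi / M))
            for k in range(M)]

def coset_povm(M, k, shift):
    """Vectors of the sub-POVM {(M/k) A_j : j = shift, shift + M/k, ...}."""
    if M % k != 0:
        raise ValueError("k must divide M")
    step = M // k
    scale = math.sqrt(step)   # operators scale by M/k, so vectors scale by its root
    full = covariant_vectors(M)
    picked = [full[(shift + j * step) % M] for j in range(k)]
    return [(scale * x, scale * y) for x, y in picked]

def operator_sum(vectors):
    """Sum of the rank 1 operators |w><w| as a 2x2 nested list."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in vectors:
        total[0][0] += x * x
        total[0][1] += x * y
        total[1][0] += y * x
        total[1][1] += y * y
    return total
```

With $M = 15$ and $k = 3$, each of the five shifts gives three operators that sum to the identity, matching the claim that every $\mathcal{B}_l$ is a POVM.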

An immediate special case is: 

Corollary: If $M$ is even then $\mathcal{E}_M$ is made up of $M/2$ pairs of orthogonal states. The von Neumann measurement
defined by any one of these orthogonal pairs is an optimal strategy for $\mathcal{E}_M$. ■

Thus if $M$ is composite we can significantly reduce the number of elements in our optimal strategy but if $M$ is
prime then this number remains large. In the next section we give a different approach to reducing the number of
elements, showing that just 3 elements always suffice for any ensemble of real qubit states.



IV. OPTIMAL POVMS WITH 3 ELEMENTS 

Davies [5] has shown that any ensemble in $d$ dimensions has an optimal strategy with $N$ elements where $d \le N \le d^2$.
This is directly based on (CONV), that is, $I(X:Y)$ is a convex function on the convex set $\mathcal{P}$ of all POVMs. Because
of this, $I(X:Y)$ will always take its maximum value at an extreme point of the convex set $\mathcal{P}$ (and also possibly at
some interior points as well). Each extreme point of $\mathcal{P}$ consists of $N$ rank 1 elements bounded by $d \le N \le d^2$. If we
restrict attention to only real ensembles then this upper bound on $N$ can be improved as follows [?].

Lemma 5: Let $\mathcal{E}$ be any ensemble of real states in $d$ dimensions. Then the Shannon mutual information can be
maximized by a POVM with $N$ elements where $d \le N \le d(d+1)/2$.

Proof: The proof proceeds along the same lines as the original one in ref. [5] with a slight replacement. For any
POVM $\{\pi_j\}$ write $\pi_j = \mu_j d\, \bar{\pi}_j$ where $\mathrm{Tr}\, \bar{\pi}_j = 1$ so

$$\sum_j \mu_j \bar{\pi}_j = I/d, \qquad \sum_j \mu_j = 1. \tag{23}$$

Let $X$ be the (compact convex) set of all positive hermitian operators with trace 1 (such as the $\bar{\pi}_j$'s). Since $I(\mathcal{E} : \mathcal{A})$
is a convex function on the set $\mathcal{P}$ of all POVMs its maximum is attained at an extreme point of $\mathcal{P}$. The essential
point of the original proof in ref. [5] is that every extreme point of $\mathcal{P}$ has at most $D + 1$ rank 1 elements where $D$ is the real
dimension of $X$. In the case of general ensembles $D = d^2 - 1$. In our case of real ensembles the members of $X$ and
$\mathcal{P}$ can be restricted to real matrices so $X$ comprises real symmetric trace 1 matrices and $D = \frac{d(d+1)}{2} - 1$. Hence the
extreme points of $\mathcal{P}$ have $\le \frac{d(d+1)}{2}$ elements. ■
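Thus for the real ensembles $\mathcal{E}_M$ with $d = 2$, POVMs with three real elements suffice to provide an optimal strategy.
To describe such a POVM, we first introduce the three real (un-normalised) vectors

$$|\omega_0\rangle = c \binom{1}{0}, \tag{24a}$$

$$|\omega_1\rangle = a \binom{\cos\varphi_a}{\sin\varphi_a}, \tag{24b}$$

$$|\omega_2\rangle = b \binom{\cos\varphi_b}{\sin\varphi_b}, \tag{24c}$$

where the first vector lies along the first basis direction and the remaining two are in general position. Imposing the
condition $\sum_j |\omega_j\rangle\langle\omega_j| = I$ we get

$$c = \sqrt{2 - a^2 - b^2}, \tag{25a}$$

$$a^2 = \frac{\cos\varphi_b}{\sin\varphi_a \sin(\varphi_a - \varphi_b)}, \tag{25b}$$

$$b^2 = \frac{\cos\varphi_a}{\sin\varphi_b \sin(\varphi_b - \varphi_a)}, \tag{25c}$$

and

$$0 \le a^2 + b^2 \le 2. \tag{25d}$$

Once the angles $\varphi_a$ and $\varphi_b$ have been chosen, $a$, $b$ and $c$ are fixed. Finally we rotate these vectors around the y-axis
through an angle $\theta$ to make the general POVM with three real rank 1 elements:

$$\omega_j(\theta) = |\omega_j(\theta)\rangle\langle\omega_j(\theta)|, \tag{26a}$$

$$|\omega_j(\theta)\rangle = V(\theta)|\omega_j\rangle, \qquad V(\theta) = \exp(-i\theta\sigma_y). \tag{26b}$$

This gives the most general POVM $\{\omega_0(\theta), \omega_1(\theta), \omega_2(\theta)\}$ in terms of three independent parameters $\varphi_a$, $\varphi_b$ and $\theta$.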
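
The parameterisation of Eqs. (24)-(26) can be sketched as follows. The code computes $a^2$, $b^2$ and $c^2$ from the two angles, rotates the vectors, and lets one check the resolution of the identity; `three_element_vectors` and `resolution` are hypothetical names, and the small negative tolerance is an assumption to absorb rounding.

```python
import math

def three_element_vectors(phi_a, phi_b, theta=0.0):
    """Un-normalised real vectors |omega_j(theta)> of Eqs. (24)-(26).

    Raises ValueError when the angles do not give a valid POVM
    (a^2 and b^2 must be non-negative with a^2 + b^2 <= 2).
    """
    a2 = math.cos(phi_b) / (math.sin(phi_a) * math.sin(phi_a - phi_b))
    b2 = math.cos(phi_a) / (math.sin(phi_b) * math.sin(phi_b - phi_a))
    c2 = 2.0 - a2 - b2
    if min(a2, b2, c2) < -1e-12:
        raise ValueError("angles do not define a POVM")
    a, b, c = (math.sqrt(max(t, 0.0)) for t in (a2, b2, c2))
    raw = [(c, 0.0),
           (a * math.cos(phi_a), a * math.sin(phi_a)),
           (b * math.cos(phi_b), b * math.sin(phi_b))]
    ct, st = math.cos(theta), math.sin(theta)
    # V(theta) = exp(-i theta sigma_y) acts on real vectors as a rotation by theta.
    return [(ct * x - st * y, st * x + ct * y) for x, y in raw]

def resolution(vectors):
    """Sum of the rank 1 operators |w><w| as a 2x2 nested list."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in vectors:
        total[0][0] += x * x
        total[0][1] += x * y
        total[1][0] += y * x
        total[1][1] += y * y
    return total
```

Angle pairs that violate Eq. (25d), such as $\varphi_a = -\varphi_b = \pi/5$, are rejected, which shows that not every choice of directions can be completed to a three-element POVM.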

We are now in a position to maximize the Shannon mutual information of $\mathcal{E}_M$ with (at most) three-element POVMs.
We first give a useful preliminary lemma.

Lemma 6: Let $\mathcal{A} = \{\lambda_a^2 |a\rangle\langle a|\}$ be any POVM with rank 1 elements labelled by $a$, where $0 \le \lambda_a \le 1$ is real and

$$|a\rangle = \binom{\cos\theta_a}{\sin\theta_a}$$

in the z-spin basis. Then the mutual information for $\mathcal{E}_M$ is given by

$$I(\mathcal{E}_M : \mathcal{A}) = \sum_a \frac{\lambda_a^2}{2}\, I(\theta_a), \tag{27}$$

where $I(\theta)$ is the function given in Eq. (17).

Proof: The states $|\psi_k\rangle$ of $\mathcal{E}_M$ given in Eq. (6) lead to the conditional probabilities

$$P(a|k) = \lambda_a^2 |\langle\psi_k|a\rangle|^2 = \frac{\lambda_a^2}{2}\left(1 + \cos\left(2\theta_a - \frac{2k\pi}{M}\right)\right).$$

Substituting these into Eq. (2) readily yields the formula Eq. (27) after a little algebra. ■

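Theorem 2: The Shannon mutual information of $\mathcal{E}_M$ (for $M \ge 2$) is maximized by the POVM $\mathcal{W} = \{\omega_j^* =
|\omega_j^*\rangle\langle\omega_j^*| : j = 0, 1, 2\}$ where

$$|\omega_0^*\rangle = \sqrt{2 - a^2 - b^2}\, \binom{0}{1}, \tag{28a}$$

$$|\omega_1^*\rangle = a \binom{-\sin(\frac{m\pi}{M})}{\cos(\frac{m\pi}{M})}, \tag{28b}$$

$$|\omega_2^*\rangle = b \binom{\sin(\frac{n\pi}{M})}{\cos(\frac{n\pi}{M})}, \tag{28c}$$

and

$$a^2 = \frac{\cos(\frac{n\pi}{M})}{\sin(\frac{m\pi}{M}) \sin(\frac{(m+n)\pi}{M})} \ge 0, \tag{29a}$$

$$b^2 = \frac{\cos(\frac{m\pi}{M})}{\sin(\frac{n\pi}{M}) \sin(\frac{(m+n)\pi}{M})} \ge 0. \tag{29b}$$

Here $m$ and $n$ are any positive integers satisfying

$$0 \le a^2 + b^2 \le 2. \tag{29c}$$

In some cases one of $a$, $b$ and $\sqrt{2 - a^2 - b^2}$ is zero and the POVM has only two elements.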

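Proof: For the three element POVM $\mathcal{W}(\theta, \varphi_a, \varphi_b) = \{\omega_0(\theta), \omega_1(\theta), \omega_2(\theta)\}$ with rank 1 elements, lemma 6
immediately gives

$$I(\mathcal{E}_M : \mathcal{W}) = \left(1 - \frac{a^2}{2} - \frac{b^2}{2}\right) I(\theta) + \frac{a^2}{2}\, I(\theta + \varphi_a) + \frac{b^2}{2}\, I(\theta + \varphi_b).$$

Hence $I(\mathcal{E}_M : \mathcal{W}) \le \max_\theta I(\theta)$. By lemma 3 this maximum is $I(\pi/2)$, the accessible information of $\mathcal{E}_M$. Furthermore
$I(\theta)$ is periodic in $\theta$ with period $\pi/M$. Hence we can achieve $I(\mathcal{E}_M : \mathcal{W}) = I(\pi/2)$ by setting $\theta = \pi/2$ and choosing $\varphi_a$ and
$\varphi_b$ to be any integer multiples of $\pi/M$. This gives Eqs. (28). Eqs. (29) are just the condition for $\{\omega_j^*\}$ to be a POVM. ■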
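
Theorem 2 can be checked numerically by building the vectors of Eqs. (28)-(29) and computing the mutual information directly from the outcome probabilities, then comparing with $I(\pi/2)$. This is a sketch under the stated conventions; `accessible_info`, `theorem2_vectors` and `mutual_information` are hypothetical names.

```python
import math

def accessible_info(M):
    """I(pi/2) from Eq. (17), in nats."""
    total = 0.0
    for k in range(M):
        x = 1.0 + math.cos(math.pi - 2.0 * math.pi * k / M)
        if x > 0.0:
            total += x * math.log(x)
    return total / M

def theorem2_vectors(M, m, n):
    """Un-normalised vectors of Eqs. (28)-(29); ValueError if (29) fails."""
    s = math.sin((m + n) * math.pi / M)
    a2 = math.cos(n * math.pi / M) / (math.sin(m * math.pi / M) * s)
    b2 = math.cos(m * math.pi / M) / (math.sin(n * math.pi / M) * s)
    c2 = 2.0 - a2 - b2
    if min(a2, b2, c2) < -1e-12:
        raise ValueError("(m, n) violates Eqs. (29)")
    a, b, c = (math.sqrt(max(t, 0.0)) for t in (a2, b2, c2))
    return [(0.0, c),
            (-a * math.sin(m * math.pi / M), a * math.cos(m * math.pi / M)),
            (b * math.sin(n * math.pi / M), b * math.cos(n * math.pi / M))]

def mutual_information(M, vectors):
    """I(E_M : W) in nats, computed from Eq. (2) with equal priors 1/M."""
    states = [(math.cos(k * math.pi / M), math.sin(k * math.pi / M))
              for k in range(M)]
    total = 0.0
    for x, y in vectors:
        probs = [(x * sx + y * sy) ** 2 for sx, sy in states]  # P(j|k)
        p_out = sum(probs) / M                                  # P(j)
        if p_out <= 0.0:
            continue  # an element of zero weight never occurs
        for p in probs:
            if p > 0.0:
                total += (p / M) * math.log(p / p_out)
    return total
```

For instance, $(M, m, n) = (5, 2, 2)$ and both $(7, 2, 2)$ and $(7, 3, 3)$ satisfy Eqs. (29), and in each case the directly computed mutual information agrees with `accessible_info(M)` to within rounding.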
From this theorem we can develop various kinds of optimal strategies. We noted previously in corollary 1 that if $M$
is even, then there exists an optimal strategy based on a pair of orthogonal directions. This also follows from theorem
2: if $M = 4L - 2$ with $L = 1, 2, \ldots$ then we may take $n = 2L - 1$, giving $a = 0$ and a 2-element POVM based on the
directions $\binom{0}{1}$ and $\binom{1}{0}$. If $M = 4L$ with $L = 1, 2, \ldots$, we may take $m = n = L$, giving $\sqrt{2 - a^2 - b^2} = 0$ and
an optimal POVM based on the directions $\frac{1}{\sqrt{2}}\binom{-1}{1}$ and $\frac{1}{\sqrt{2}}\binom{1}{1}$. In both cases the pair of directions coincides with an
orthogonal pair of states of $\mathcal{E}_M$.

If $M$ is odd, at least 3 outputs are required. In the case of $M = 3$ we get an optimum strategy with three elements
of equal norm. This coincides with our previous result $\mathcal{A}_3$ of theorem 1 and lemma 3. The cases of $M = 5$ and $M = 7$
are more interesting. In both cases, the optimum strategies consist of three elements with two different norms
(in contrast to the $Z_M$-covariant strategies of theorem 1). A solution for $M = 5$ is shown in Fig. 2. The POVM
elements are represented by the thick solid lines and the dashed lines represent the input states. (Note that, for ease
of presentation, these dashed lines representing the states of $\mathcal{E}_M$, symmetrically distributed around a whole circle,
correspond to the vectors $(-1)^k|\psi_k\rangle$ rather than the original vectors in Eq. (6).) Depending on the choice of parameters
$(m, n)$ in theorem 2, there can be several configurations of the POVM directions. But by the symmetry of $\mathcal{E}_5$ they all
lie in the same position relative to the ensemble as a whole, characterized by $a^2 = b^2 = 1/(2\sin^2\frac{2\pi}{5})$ as shown in Fig. 2.

FIG. 2. The optimal POVM directions (thick solid lines) given by theorem 2 in the case of $M = 5$. The input states are
represented as $(-1)^k|\psi_k\rangle$ by the dashed lines whose lengths correspond to a unit state vector. The lengths of the thick solid
lines are scaled according to the normalization factors of the corresponding POVM elements.
Fig. 3 shows the case of M = 7. There are now two inequivalent classes of POVM element directions. One
corresponds to a 2 = b 2 = 1/ (2sin 2 ^) where the angle between the two measurement vectors directed downward is 
(the left figure), and the other corresponds to a 2 = b 2 = l/(2sin 2 ^ L ) where the angle between the two measurement 
vectors directed downward is (the right figure). 




FIG. 3. The two inequivalent optimal POVMs in the case of $M = 7$. The POVM directions and input states are represented
by thick and dashed lines respectively according to the conventions of Fig. 2.



Lemma 6 and theorem 2 may be used to provide a further variety of optimal $K$-element POVMs for $\mathcal{E}_M$ where $K$
is between 3 and $M$:

Lemma 7: Let $\mathcal{A}$ be any POVM as described in lemma 6 for which all angles $\theta_a$ have the form

$$\theta_a = \frac{\pi}{2} + k_a\frac{\pi}{M}, \qquad \text{where } k_a \text{ is an integer.} \tag{30}$$

Then $\mathcal{A}$ is an optimal strategy for $\mathcal{E}_M$.

Proof: Since $I(\theta)$ is periodic with period $\pi/M$ we have $I(\theta_a) = I(\pi/2)$ for all $a$. Also $\sum_a \lambda_a^2 = 2$ so that Eq. (27)
immediately gives $I(\mathcal{E}_M : \mathcal{A}) = I(\pi/2)$, i.e. $\mathcal{A}$ is optimal. ■

Now note the following facts: 

(a) All POVMs in theorem 2 satisfy Eq. (30).

(b) If $\mathcal{A} = \{A_i\}$ is any POVM satisfying Eq. (30) then any $Z_M$-shifted version $\mathcal{A}_l$ of $\mathcal{A}$, defined for each $l \in Z_M$ by

$$\mathcal{A}_l = \{V^l A_i V^{\dagger l}\}$$

is a POVM also satisfying Eq. (30). (The angles $\theta_a$ are just shifted by $l\pi/M$.)

(c) If $\mathcal{A}_1, \ldots, \mathcal{A}_N$ is any list of POVMs satisfying Eq. (30) then any convex combination of the $\mathcal{A}_i$'s will satisfy Eq.
(30). (In forming convex combinations we naturally amalgamate POVM elements from different $\mathcal{A}_i$'s that lie in the
same direction.)

Hence any convex combination of any $Z_M$-shifted versions of the POVMs in theorem 2 will be an optimal strategy.
For example, let us consider a convex combination between two POVMs in the case of $M = 5$. The following $\{\omega_j\}$ is
one of the optimum detection strategies from theorem 2:



a 2 f . Air Air 
= —(I - sm(—)a x - cos( — )ct z ), 

W 2 = — (I - Sm^y)^ - COs(y-)cr z ), 



(31a) 
(31b) 

(31c) 



where a 2 = l/(2sin 2 ^). The convex combination between {u>j} and {V 2 UjV> 2 } forms the resolution of the identity 



2 2 

(1-A)5^<2>j ■ +\^V 2 u k V^ 2 =1 (A>0), 

j=0 k=0 



(32) 



and we define 
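$$\hat{\Omega}_0 = (1-\lambda)\,\hat{\omega}_0 + \lambda V^2\hat{\omega}_2V^{\dagger 2}, \tag{33a}$$

$$\hat{\Omega}_1 = (1-\lambda)\,\hat{\omega}_1 + \lambda V^2\hat{\omega}_0V^{\dagger 2}, \tag{33b}$$

$$\hat{\Omega}_2 = (1-\lambda)\,\hat{\omega}_2, \tag{33c}$$

$$\hat{\Omega}_3 = \lambda V^2\hat{\omega}_1V^{\dagger 2}. \tag{33d}$$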



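(Note that $\hat{\omega}_0 \propto V^2\hat{\omega}_2V^{\dagger 2}$ and $\hat{\omega}_1 \propto V^2\hat{\omega}_0V^{\dagger 2}$.) This gives a 4-element POVM $\{\hat{\Omega}_j\}$ which maximizes the Shannon
mutual information for $\mathcal{E}_5$.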
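The $M = 5$ construction can be sanity-checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' calculation: the signal states are taken as $|\psi_k\rangle = \cos(k\pi/5)|\uparrow\rangle + \sin(k\pi/5)|\downarrow\rangle$ with priors $1/5$, $V$ is taken to be the real rotation with $V|\psi_k\rangle = |\psi_{k+1}\rangle$, the three-element POVM is built from the measurement vectors of Eq. (36) with $m = n = 2$, and `lam` is an arbitrary mixing weight. All variable and function names are hypothetical.

```python
import numpy as np

M, m, lam = 5, 2, 0.3            # lam: arbitrary illustrative weight in (0, 1)
cot = 1.0 / np.tan(m * np.pi / M)

def ket(angle):
    """Real qubit vector cos(angle)|up> + sin(angle)|down>."""
    return np.array([np.cos(angle), np.sin(angle)])

def mutual_information(states, povm):
    """Shannon mutual information (bits) for equiprobable pure real states."""
    p = np.array([[s @ E @ s for E in povm] for s in states])  # p[i, j] = P(j|i)
    pj = p.mean(axis=0)
    ratio = np.where(p > 1e-15, p / pj, 1.0)                   # 0*log(0) -> 0
    return float(np.sum(p * np.log2(ratio)) / len(states))

states = [ket(k * np.pi / M) for k in range(M)]                # assumed convention
V = np.array([[np.cos(np.pi / M), -np.sin(np.pi / M)],
              [np.sin(np.pi / M),  np.cos(np.pi / M)]])        # assumed form of V
V2 = V @ V

# Three-element POVM written through its (unnormalized) measurement vectors
vecs = [np.array([0.0, -np.sqrt(1.0 - cot**2)]),
        np.array([-1.0, cot]) / np.sqrt(2.0),
        np.array([ 1.0, cot]) / np.sqrt(2.0)]
w = [np.outer(v, v) for v in vecs]
shifted = [V2 @ E @ V2.T for E in w]

# Four-element POVM obtained by amalgamating parallel elements
Omega = [(1 - lam) * w[0] + lam * shifted[2],
         (1 - lam) * w[1] + lam * shifted[0],
         (1 - lam) * w[2],
         lam * shifted[1]]

I_W = mutual_information(states, w)
I_Omega = mutual_information(states, Omega)
print("sum of the four elements is I:", np.allclose(sum(Omega), np.eye(2)))
print("I(W) =", I_W, " I(Omega) =", I_Omega)
```

With these conventions the four operators sum to the identity and the two mutual informations agree to rounding error, as expected for any POVM whose directions all satisfy Eq. (30).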

The strategies in theorem 2 are not generally $Z_M$-covariant but they correspond to extreme points of $\mathcal{P}$. On the
other hand the $Z_M$-covariant strategy of theorem 1 is generally not an extreme point of $\mathcal{P}$. The $Z_M$-covariant POVM
of theorem 1 can be related to the asymmetrical 3-element POVM of theorem 2 as follows. Note first that if $W = \{\hat{\omega}_j\}$
is any optimal POVM then so is $mW = \{V^m\hat{\omega}_jV^{\dagger m}\}$ for any $m \in Z_M$. Indeed

$$I(\mathcal{E}_M : W) = I(\mathcal{E}_M : mW) \tag{34}$$

since the set of states of $\mathcal{E}_M$ is invariant under the action of $Z_M$. Given any one of the $N$ (= 2, 3)-element POVMs
$\{\hat{\omega}_j\}$ defined in theorem 2, one can consider the resolution of the identity

$$\frac{1}{M}\sum_{m=0}^{M-1}\sum_{j=0}^{N-1} V^m\hat{\omega}_jV^{\dagger m} = I. \tag{35}$$

But the $MN$ elements $\{V^m\hat{\omega}_jV^{\dagger m}\}$ are proportional to each other in groups of $N$ and these groups may each naturally
be summed and assigned a single element. This leads to the covariant $M$-element POVM which is just $\mathcal{A}_M$ of theorem
1 and lemma 3. In this sense $\mathcal{A}_M$ may be thought of as a convex combination

$$\mathcal{A}_M = \frac{1}{M}\sum_{k \in Z_M} kW,$$

where $W$ is any one of the POVMs in theorem 2. If we know that $\mathcal{A}_M$ is optimal then Eq. (34), this convex decomposition and the convexity of the mutual information will imply
that $W$ is optimal too. This provides an alternative proof of theorem 2 if we already know theorem 1 and lemma
3. On the other hand, if conversely we are given the result of theorem 2 (which uses lemma 3) then the accessible
information of $\mathcal{E}_M$ must be $I(\pi/2)$, so $\mathcal{A}_M$ must be optimal (since $I(\mathcal{E}_M : \mathcal{A}_M) = I(\pi/2)$ by definition of $I(\theta)$ and $\mathcal{A}_M$).
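The amalgamation step behind Eq. (35) can be illustrated with a short numerical sketch. It assumes, as an illustration only, that $V$ is the real rotation by $\pi/M$, that $W$ is the three-element POVM with $m = n = 2$ for $M = 5$ built from the vectors of Eq. (36), and that the covariant POVM has elements of weight $2/M$ along the directions $\pi/2 + k\pi/M$. The helper names are hypothetical.

```python
import numpy as np

M, m = 5, 2                       # example values only
cot = 1.0 / np.tan(m * np.pi / M)

def rot(angle):
    """Real 2x2 rotation; rot(pi/M) is the assumed form of V."""
    return np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle),  np.cos(angle)]])

# Three-element POVM W written through its measurement vectors
vecs = [np.array([0.0, -np.sqrt(1.0 - cot**2)]),
        np.array([-1.0, cot]) / np.sqrt(2.0),
        np.array([ 1.0, cot]) / np.sqrt(2.0)]
W = [np.outer(v, v) for v in vecs]

# The M*N operators (1/M) V^k w_j V^{dagger k} summed in Eq. (35)
terms = [rot(k * np.pi / M) @ E @ rot(k * np.pi / M).T / M
         for k in range(M) for E in W]

# Candidate covariant directions at angles pi/2 + k*pi/M
dirs = [np.array([np.cos(np.pi / 2 + k * np.pi / M),
                  np.sin(np.pi / 2 + k * np.pi / M)]) for k in range(M)]

# Amalgamate the terms that lie along the same direction
grouped = [np.zeros((2, 2)) for _ in range(M)]
for E in terms:
    k_best = int(np.argmax([d @ E @ d for d in dirs]))  # parallel direction
    grouped[k_best] += E

covariant = [(2.0 / M) * np.outer(d, d) for d in dirs]
print("Eq. (35) holds:", np.allclose(sum(terms), np.eye(2)))
print("groups reproduce the covariant POVM:",
      all(np.allclose(G, C) for G, C in zip(grouped, covariant)))
```

Each of the $M$ groups collects $N = 3$ proportional operators whose weights add up to $2/M$, which is how the three-element strategy averages into the covariant one in this example.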

V. IMPLEMENTATION 

The optimal POVMs $\mathcal{A}_M$ and $W$ given in theorems 1 and 2 may be of interest from the viewpoint of putting
quantum detection theory to the test. None of the POVMs for attaining maximum mutual information has been
demonstrated by experiment yet. So far, only two kinds of optimal quantum detection scenarios have been confirmed
experimentally. One is the Helstrom bound, the minimum average error probability, and the other is the
Ivanovic-Dieks-Peres bound, which gives the maximum probability for error-free detection, sometimes referred to as
unambiguous measurement. (A concise review of both criteria can be found in the literature.) The former
scenario was first demonstrated experimentally by Barnett and Riis. The latter has been demonstrated in the
laboratory by Huttner et al. Both of these are concerned with discrimination between binary nonorthogonal
states. In our case of $\mathcal{A}_M$ and $W$ for $\mathcal{E}_M$ with $M$ odd, we are dealing with essentially nonorthogonal measurement
vectors in $\mathcal{H}_2$, which is called a generalized measurement. No von Neumann measurement can be an optimal strategy
for $\mathcal{E}_M$ with $M$ odd. This case is of particular interest here. It is already well known that this kind of generalized
measurement can be converted into a standard von Neumann measurement in a larger Hilbert space by introducing
an ancillary system. This so-called Naimark extension ensures that any POVM can be physically implemented in
principle.

In this section we propose an optical scheme to demonstrate the optimal POVMs specified by $W$ for $\mathcal{E}_M$ made of
single mode photon polarization states. As seen in the previous section, $W$ has three outcomes at most and suffices
to provide an optimal strategy for all $\mathcal{E}_M$'s. For $M$ odd, it is always possible to find the optimal strategy with $m = n$,
that is, $a^2 = b^2 = 1/(2\sin^2\frac{m\pi}{M})$ in theorem 2, if $m$ is taken as $\frac{M}{4} < m < \frac{M}{2}$. We consider the implementation of this
particular detection strategy. The measurement vectors can be represented by

$$|\omega_0\rangle = -\sin\frac{\gamma}{2}\,|\downarrow\rangle, \tag{36a}$$

$$|\omega_1\rangle = -\frac{1}{\sqrt{2}}\,|\uparrow\rangle + \frac{1}{\sqrt{2}}\cos\frac{\gamma}{2}\,|\downarrow\rangle, \tag{36b}$$

$$|\omega_2\rangle = \frac{1}{\sqrt{2}}\,|\uparrow\rangle + \frac{1}{\sqrt{2}}\cos\frac{\gamma}{2}\,|\downarrow\rangle, \tag{36c}$$



where

$$\cos\frac{\gamma}{2} = \cot\frac{m\pi}{M}, \qquad \sin\frac{\gamma}{2} = \sqrt{1 - \cot^2\frac{m\pi}{M}}, \tag{37}$$

and $|\uparrow\rangle$ and $|\downarrow\rangle$ form an orthonormal basis of polarization. The first step is to make orthogonal measurement vectors by
embedding $\{|\omega_0\rangle, |\omega_1\rangle, |\omega_2\rangle\}$ into a three or higher dimensional Hilbert space. One possible physical prescription is
to make an optical circuit with two input ports, say, "a" and "b". The signal state is guided into the port "a", while
the port "b" is initialized as the vacuum state. We can then consider the four dimensional Hilbert space spanned by
the orthonormal basis $\{|E_j\rangle\}$,

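$$|E_0\rangle = |\uparrow\rangle_a|0\rangle_b, \tag{38a}$$

$$|E_1\rangle = |\downarrow\rangle_a|0\rangle_b, \tag{38b}$$

$$|E_2\rangle = |0\rangle_a|\uparrow\rangle_b, \tag{38c}$$

$$|E_3\rangle = |0\rangle_a|\downarrow\rangle_b, \tag{38d}$$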

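where $|0\rangle$ is the vacuum state and the subscripts $a$ and $b$ indicate the port "a" and "b", respectively. A natural
orthogonalization is

$$|\Omega_0\rangle = |\omega_0\rangle_a|0\rangle_b + \cos\frac{\gamma}{2}\,|0\rangle_a|\uparrow\rangle_b, \tag{39a}$$

$$|\Omega_1\rangle = |\omega_1\rangle_a|0\rangle_b + \frac{1}{\sqrt{2}}\sin\frac{\gamma}{2}\,|0\rangle_a|\uparrow\rangle_b, \tag{39b}$$

$$|\Omega_2\rangle = |\omega_2\rangle_a|0\rangle_b + \frac{1}{\sqrt{2}}\sin\frac{\gamma}{2}\,|0\rangle_a|\uparrow\rangle_b, \tag{39c}$$

$$|\Omega_3\rangle = |0\rangle_a|\downarrow\rangle_b, \tag{39d}$$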



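or equivalently,

$$|\Omega_0\rangle = -\sin\frac{\gamma}{2}\,|\downarrow\rangle_a|0\rangle_b + \cos\frac{\gamma}{2}\,|0\rangle_a|\uparrow\rangle_b, \tag{40a}$$

$$|\Omega_1\rangle = \frac{1}{\sqrt{2}}\left(-|\uparrow\rangle_a|0\rangle_b + \cos\frac{\gamma}{2}\,|\downarrow\rangle_a|0\rangle_b + \sin\frac{\gamma}{2}\,|0\rangle_a|\uparrow\rangle_b\right), \tag{40b}$$

$$|\Omega_2\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_a|0\rangle_b + \cos\frac{\gamma}{2}\,|\downarrow\rangle_a|0\rangle_b + \sin\frac{\gamma}{2}\,|0\rangle_a|\uparrow\rangle_b\right), \tag{40c}$$

$$|\Omega_3\rangle = |0\rangle_a|\downarrow\rangle_b. \tag{40d}$$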
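The properties of the extended vectors can be verified with a few lines of linear algebra. This is a minimal sketch under stated assumptions: the components below are one reading of Eqs. (36) and (40) in the bases $(|\uparrow\rangle, |\downarrow\rangle)$ and $(|E_0\rangle, \ldots, |E_3\rangle)$, the signal states are taken as $|\psi_i\rangle = \cos(i\pi/M)|\uparrow\rangle + \sin(i\pi/M)|\downarrow\rangle$, and $M = 5$, $m = 2$ are example values.

```python
import numpy as np

M, m = 5, 2                              # example values only
c = 1.0 / np.tan(m * np.pi / M)          # cos(gamma/2) as in Eq. (37)
s = np.sqrt(1.0 - c**2)                  # sin(gamma/2)
r = 1.0 / np.sqrt(2.0)

# Rows: qubit measurement vectors omega_0, omega_1, omega_2
omega = np.array([[0.0, -s],
                  [-r,  r * c],
                  [ r,  r * c]])

# Rows: extended vectors Omega_0 ... Omega_3 in the basis (E0, E1, E2, E3)
Omega = np.array([[0.0, -s,    c,     0.0],
                  [-r,  r * c, r * s, 0.0],
                  [ r,  r * c, r * s, 0.0],
                  [0.0, 0.0,   0.0,   1.0]])

# Signal states (assumed convention) and their embedding |psi_i>_a |0>_b
k = np.arange(M)
psi = np.stack([np.cos(k * np.pi / M), np.sin(k * np.pi / M)], axis=1)
psi_ext = np.hstack([psi, np.zeros((M, 2))])

orthonormal = np.allclose(Omega @ Omega.T, np.eye(4))
same_amplitudes = np.allclose(omega @ psi.T, Omega[:3] @ psi_ext.T)
outcome3_silent = np.allclose(Omega[3] @ psi_ext.T, 0.0)
print(orthonormal, same_amplitudes, outcome3_silent)
```

The three flags check, in order, that the four extended vectors form an orthonormal basis, that they reproduce the amplitudes $\langle\omega_j|\psi_i\rangle$ of the original POVM, and that the fourth outcome has zero probability for every signal.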



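It is easy to check that $\{|\Omega_0\rangle, |\Omega_1\rangle, |\Omega_2\rangle\}$ give the same channel matrix as $\{|\omega_0\rangle, |\omega_1\rangle, |\omega_2\rangle\}$, that is, $\langle\omega_j|\psi_i\rangle = \langle\Omega_j|\left(|\psi_i\rangle_a|0\rangle_b\right)$ $(j = 0, 1, 2)$. The second step is to decompose the von Neumann measurement $\{|\Omega_j\rangle\}$ into a unitary
transformation followed by a measurement in the basis $\{|E_j\rangle\}$ in order to find a practical detector structure. We
may write

$$\langle\Omega_0| = \langle E_2|U_2U_1, \tag{41a}$$

$$\langle\Omega_1| = \langle E_1|U_2U_1, \tag{41b}$$

$$\langle\Omega_2| = \langle E_0|U_2U_1, \tag{41c}$$

$$\langle\Omega_3| = \langle E_3|U_2U_1, \tag{41d}$$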

where U\ and U 2 are given by the matrices 
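$$U_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\frac{\gamma}{2} & \sin\frac{\gamma}{2} & 0 \\ 0 & -\sin\frac{\gamma}{2} & \cos\frac{\gamma}{2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{42}$$

$$U_2 = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{43}$$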



in the $\{|E_0\rangle, |E_1\rangle, |E_2\rangle, |E_3\rangle\}$-basis representation. Eqs. (41) mean that in the detector, the signal state $|\psi_i\rangle_a|0\rangle_b$ is
first transformed by $U_2U_1$, and is then measured in the basis $\{|E_j\rangle\}$, which corresponds to the simultaneous
measurement with respect to which-path and which-polarization. The final step is to translate $U_2U_1$ into a practical circuit. In
fact, this unitary transformation can be effected by a simple circuit consisting of passive linear optical devices such
as polarizing beam splitters, polarization rotators, and halfwave plates. The circuit is shown in Fig. 4. The $U_2U_1$
part consists of four halfwave plates, two polarizing beam splitters, and two polarization rotators. The polarization
rotator represented by the circle with the rotation angle $\gamma$ performs

$$R_y(\gamma) = \begin{pmatrix} \cos\frac{\gamma}{2} & \sin\frac{\gamma}{2} \\ -\sin\frac{\gamma}{2} & \cos\frac{\gamma}{2} \end{pmatrix}. \tag{44}$$



The polarizing beam splitter represented by the square functions as a perfect mirror only for one of the two polarizations (fast axis
polarization). Light polarized along the other polarization (slow axis polarization) passes straight through it perfectly. The
measurement $\{|E_j\rangle\}$ is made by photon counting at the four output ports. Note that only a single photon count at
one of the three ports is expected and the outcome $|E_3\rangle$ is never expected. This structure is valid for any $M$ (the
number of the signals) if one tunes the rotation angle $\gamma$ in $R_y(\gamma)$ according to the value of $M$ (see Eq. (37)). The
circuit is simple enough to be implemented with present technology.
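How the rotation angle would be tuned for different $M$ can be sketched numerically. The code below is an illustration under explicit assumptions, not a description of the actual optical layout: `U1` and `U2` are one factorization consistent with Eq. (41) (a rotation by $\gamma/2$ in the $(E_1, E_2)$ plane followed by a balanced mixing of $E_0$ and $E_1$), $m$ is chosen as the smallest integer above $M/4$, and the signal states are taken as $\cos(i\pi/M)|\uparrow\rangle + \sin(i\pi/M)|\downarrow\rangle$. The function names are hypothetical.

```python
import numpy as np

def circuit(M):
    """Return (gamma, U2 @ U1) for odd M, using m = floor(M/4) + 1 (assumed choice)."""
    m = M // 4 + 1
    c = 1.0 / np.tan(m * np.pi / M)      # cos(gamma/2) as in Eq. (37)
    s = np.sqrt(1.0 - c**2)              # sin(gamma/2)
    r = 1.0 / np.sqrt(2.0)
    U1 = np.array([[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]], dtype=float)
    U2 = np.array([[r, r, 0, 0], [-r, r, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
    return 2.0 * np.arccos(c), U2 @ U1

def click_probabilities(M):
    """P[i, j]: probability of a photon count at port E_j when signal psi_i is sent."""
    _, U = circuit(M)
    k = np.arange(M)
    psi = np.zeros((M, 4))
    psi[:, 0] = np.cos(k * np.pi / M)    # |up>_a |0>_b component
    psi[:, 1] = np.sin(k * np.pi / M)    # |down>_a |0>_b component
    return (psi @ U.T) ** 2              # amplitudes are real here

for M in (3, 5, 7, 9):
    gamma, U = circuit(M)
    P = click_probabilities(M)
    print(f"M = {M}: gamma = {gamma:.4f} rad, largest P(E3) = {P[:, 3].max():.1e}")
```

In this sketch only the single parameter `gamma` changes with $M$, while the fourth port stays dark for every signal, in line with the remark that the outcome $|E_3\rangle$ is never expected.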




FIG. 4. The optical circuit implementing $W = \{\hat{\omega}_0, \hat{\omega}_1, \hat{\omega}_2\}$. It consists of the unitary transformation $U_2U_1$ followed by the
measurement $\{|E_j\rangle\}$. $U_2U_1$ is effected by four halfwave plates, two polarizing beam splitters, and two polarization rotators.
The measurement $\{|E_j\rangle\}$ is made by photon counting at the four output ports.



VI. CONCLUDING REMARKS 



We have considered optimal strategies for symmetrical sources of real quantum states, treating in detail the sources
$\mathcal{E}_M$ of $M$ real qubit states placed symmetrically in the $x$-$z$ plane around the Bloch sphere. Davies has provided
a general theorem characterising an optimal strategy for any $G$-covariant source whose group acts irreducibly on the
whole state space. The symmetry group $Z_M$ of $\mathcal{E}_M$ does not act irreducibly on that state space so Davies' theorem
cannot be directly applied. However we proved an extension of this theorem which applies to $G$-covariant sources
of real states for which the group acts irreducibly on the subset of real states (as is the case for $\mathcal{E}_M$). This led to a
$Z_M$-covariant optimal strategy $\mathcal{A}_M$ for $\mathcal{E}_M$.

We also derived alternative optimal strategies $W$ which contain at most three real POVM elements. In deriving
these strategies $W$ we exploited the convexity of $I(X : Y)$ on the convex set $\mathcal{P}$ of all POVMs. These strategies are not
$G$-covariant in general but correspond to extreme points of $\mathcal{P}$. The small number of elements can be advantageous
for practical implementation of the detection strategies as seen in the preceding section. The $G$-covariant strategy
is not generally an extreme point of $\mathcal{P}$ but for higher dimensions it would seem easier to derive explicit $G$-covariant
solutions rather than extreme point solutions.

Our results have added to the relatively small number of quantum sources for which optimal strategies are explicitly
known. They may be extended in various straightforward ways (which we have omitted for clarity of presentation).
For example the optimal strategies $\mathcal{A}_M$ and $W$ for $\mathcal{E}_M$ remain optimal for the $M$-state source

$$\left\{(1-\epsilon)\,|\psi_k\rangle\langle\psi_k| + \epsilon\,\tfrac{1}{2}I \; : \; k \in Z_M;\ \tfrac{1}{M}\right\}$$

where each pure signal has been corrupted by noise given by the maximally mixed state $\frac{1}{2}I$. This mixed state
ensemble is clearly also $G$-covariant and the process of deriving the optimal strategy for this ensemble is much the
same as in the pure state case ($\epsilon = 0$), except that the cosine terms in Eq. (17) are multiplied by $(1 - \epsilon)$. Then the same
strategy remains optimal for the $G$-covariant mixed state ensemble although the accessible information decreases with
$\epsilon$ as expected.
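The decrease with the noise level can be illustrated numerically. The sketch below makes several assumptions that are not taken from the text: the signal states are $\cos(i\pi/M)|\uparrow\rangle + \sin(i\pi/M)|\downarrow\rangle$ with priors $1/M$, the measurement is the covariant POVM with elements of weight $2/M$ perpendicular to the signals, and $M = 5$ is an example value. The name `info_with_noise` is hypothetical.

```python
import numpy as np

def info_with_noise(M, eps):
    """Mutual information (bits) of the perpendicular covariant POVM when each
    signal is (1 - eps)|psi><psi| + eps * I/2 (assumed conventions, see above)."""
    k = np.arange(M)
    # Pure-state channel: P(j|i) = (2/M) cos^2(pi/2 + (j - i) pi / M)
    p_pure = (2.0 / M) * np.cos(np.pi / 2 + (k[None, :] - k[:, None]) * np.pi / M) ** 2
    # The maximally mixed part contributes tr(E_j)/2 = 1/M to every outcome
    p = (1.0 - eps) * p_pure + eps / M
    pj = p.mean(axis=0)
    ratio = np.where(p > 1e-15, p / pj, 1.0)      # 0*log(0) is treated as 0
    return float(np.sum(p * np.log2(ratio)) / M)

M = 5
eps_grid = np.linspace(0.0, 1.0, 11)
infos = [info_with_noise(M, e) for e in eps_grid]
for e, val in zip(eps_grid, infos):
    print(f"eps = {e:.1f}: I = {val:.4f} bits")
```

Mixing in the maximally mixed state is equivalent to scaling the cosine terms of the channel matrix by $(1 - \epsilon)$, so in this sketch the information falls monotonically and vanishes at $\epsilon = 1$.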

It is perhaps worth briefly contrasting our results of maximizing the mutual information with the problem of
minimizing the average error probability. The latter is defined for $\mathcal{E}_M$ and any $M$-element POVM $\{\hat{\pi}_k\}$ by

$$P_e = 1 - \frac{1}{M}\sum_{k=0}^{M-1}\langle\psi_k|\hat{\pi}_k|\psi_k\rangle.$$

The $P_e$-optimal strategy is $\{\hat{\pi}_k\} = \{\frac{2}{M}|\psi_k\rangle\langle\psi_k| : k \in Z_M\}$, that is, the POVM based on the state directions
themselves. This is true also for the above mixed state ensemble. (The known necessary and sufficient conditions for
$P_e$-optimality are easily verified for $\{\hat{\pi}_k\}$.) Generally $P_e$-minimization is an essentially different type of
optimization problem from $I(X : Y)$-maximization.
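The difference between the two criteria can be made concrete for $M = 5$. The following is a hedged sketch: it assumes signal states $\cos(i\pi/M)|\uparrow\rangle + \sin(i\pi/M)|\downarrow\rangle$ with priors $1/M$, compares the covariant POVM along the signal directions with the one perpendicular to them, and scores the error probability by decoding each outcome as the most likely signal (which for the first POVM coincides with pairing outcome $k$ with signal $k$). All names are illustrative.

```python
import numpy as np

M = 5
k = np.arange(M)

def channel(offset):
    """P[i, j] = (2/M) cos^2(offset + (j - i) pi / M): covariant POVM whose
    directions are rotated by `offset` from the signal directions."""
    return (2.0 / M) * np.cos(offset + (k[None, :] - k[:, None]) * np.pi / M) ** 2

def mutual_information(P):
    """Shannon mutual information (bits) for equiprobable inputs."""
    pj = P.mean(axis=0)
    ratio = np.where(P > 1e-15, P / pj, 1.0)      # 0*log(0) is treated as 0
    return float(np.sum(P * np.log2(ratio)) / M)

def error_probability(P):
    """Average error when each outcome j is decoded as its most likely signal."""
    return 1.0 - P.max(axis=0).sum() / M

along = channel(0.0)          # elements (2/M)|psi_k><psi_k|
perp = channel(np.pi / 2)     # elements perpendicular to the signals

for name, P in (("along", along), ("perpendicular", perp)):
    print(f"{name:>13}: P_e = {error_probability(P):.4f}, "
          f"I = {mutual_information(P):.4f} bits")
```

In this example the POVM along the signal directions gives the smaller error probability, $1 - 2/M$, while the perpendicular one gives the larger mutual information, so the two figures of merit rank the strategies in opposite order.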

Within the confines of our formalism, various interesting issues remain unresolved. For example we would like to
know an optimal strategy for the real $Z_M$-covariant source "double-$\mathcal{E}_M$" in 4 dimensions comprising the 2-qubit signal
states $\{|\psi_k\rangle|\psi_k\rangle : k \in Z_M;\ \frac{1}{M}\}$. In this case the symmetry group $Z_M$ does not act irreducibly even on the subset of
all real 2-qubit states. Interesting properties of double-$\mathcal{E}_M$ have been considered previously from the viewpoint of the coding
gain of transmittable information.

It is also a remaining difficult problem to optimize a quantum channel over both the a priori probability distribution 
of signals and the detection strategy for a fixed set of quantum states. The solution is known only for the binary pure 
state channel. 